Improving Contrast Set Mining
Abstract
A fundamental task in exploratory data analysis is discerning the differences between contrasting groups. Contrast set mining was developed as a data mining task that aims to identify these differences. This paper examines the algorithms, heuristics, and open issues of contrast set mining and seeks to improve it by addressing several of those open issues. It proposes four interestingness measures for ranking contrast sets: coverage, overall support, growth rate, and unusualness. It introduces a new method for discretizing quantitative attributes. It defines a new type of contrast set, the jumping contrast set, and modifies the contrast set mining process to mine both types of contrast sets on datasets containing both quantitative and categorical attributes. Finally, it introduces a simple visualization method for describing contrast sets to the end user.
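The abstract names the ranking measures but does not define them. As a rough illustration only, the sketch below computes three of them (coverage, growth rate, and unusualness) using definitions commonly found in the emerging-pattern and subgroup-discovery literature; these definitions are assumptions and may differ from the paper's. Overall support is omitted because its definition cannot be inferred from the abstract. All function names, the `group` attribute, and the toy rows are hypothetical.

```python
from math import inf
from typing import Dict, List

Row = Dict[str, str]  # one record: attribute name -> categorical value


def matches(row: Row, contrast_set: Dict[str, str]) -> bool:
    """True if the row satisfies every attribute-value pair in the set."""
    return all(row.get(attr) == val for attr, val in contrast_set.items())


def support(rows: List[Row], contrast_set: Dict[str, str],
            group_attr: str, group: str) -> float:
    """Fraction of rows in `group` that match the contrast set."""
    in_group = [r for r in rows if r[group_attr] == group]
    if not in_group:
        return 0.0
    return sum(matches(r, contrast_set) for r in in_group) / len(in_group)


def coverage(rows: List[Row], contrast_set: Dict[str, str]) -> float:
    """Assumed definition: fraction of all rows that match the set."""
    return sum(matches(r, contrast_set) for r in rows) / len(rows)


def growth_rate(rows: List[Row], contrast_set: Dict[str, str],
                group_attr: str, g1: str, g2: str) -> float:
    """Assumed definition: larger group support divided by the smaller."""
    s1 = support(rows, contrast_set, group_attr, g1)
    s2 = support(rows, contrast_set, group_attr, g2)
    high, low = max(s1, s2), min(s1, s2)
    if low == 0.0:
        return inf if high > 0.0 else 0.0
    return high / low


def unusualness(rows: List[Row], contrast_set: Dict[str, str],
                group_attr: str, group: str) -> float:
    """Assumed definition (weighted relative accuracy):
    p(set) * (p(group | set) - p(group))."""
    covered = [r for r in rows if matches(r, contrast_set)]
    if not covered:
        return 0.0
    p_set = len(covered) / len(rows)
    p_group = sum(r[group_attr] == group for r in rows) / len(rows)
    p_group_given_set = sum(r[group_attr] == group for r in covered) / len(covered)
    return p_set * (p_group_given_set - p_group)


# Made-up toy rows, for illustration only.
toy_rows: List[Row] = (
    [{"group": "A", "color": "red"}] * 2
    + [{"group": "A", "color": "blue"}] * 2
    + [{"group": "B", "color": "red"}] * 1
    + [{"group": "B", "color": "blue"}] * 3
)
candidate = {"color": "red"}
```

On the toy rows, calling `growth_rate(toy_rows, candidate, "group", "A", "B")` compares how often `color = red` occurs in group A versus group B; a ranking step could then sort candidate contrast sets by any one of these scores.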
Similar resources
Supporting Factors to Improve the Explanatory Potential of Contrast Set Mining: Analyzing Brain Ischaemia Data
The goal of exploratory pattern mining is to find patterns that exhibit yet unknown relationships in data and to provide insightful representations of detected relationships. This paper explores contrast set mining and an approach to improving its explanatory potential by using so-called supporting factors, which provide additional descriptions of the detected patterns. The proposed methodolo...
Mining Interesting Contrast Sets
Contrast set mining has been developed as a data mining task which aims at discerning differences across groups. These groups can be patients, organizations, molecules, and even time-lines. A valid contrast set is a conjunction of attribute-value pairs that differ significantly in their distribution across groups. The search for valid contrast sets can produce a prohibitively large number of re...
Contrast Set Mining Through Subgroup Discovery Applied to Brain Ischaemina Data
Contrast set mining aims at finding differences between different groups. This paper shows that a contrast set mining task can be transformed to a subgroup discovery task whose goal is to find descriptions of groups of individuals with unusual distributional characteristics with respect to the given property of interest. The proposed approach to contrast set mining through subgroup discovery wa...
A novel feature selection techniques based on contrast set mining
Data classification is a challenging task in the era of big data due to the high number of features. Feature selection is a step in the process of knowledge discovery in data that aims to reduce dimensionality and improve classification performance. The purpose of this research is to define new techniques for feature selection in order to improve classification accuracy and reduce the time required for...
Mining Discrimination Patterns along Temporal Databases
In certain Data Analysis tasks, understanding the underlying differences between groups or classes is of the utmost importance. Contrast Set Mining relies on discovering significant patterns by contrasting two or more groups. A Contrast Set is a conjunction of attribute-value pairs that differ meaningfully in their distribution across groups. One technique proposed is Rules for Contrast Sets (RCS...
Publication date: 2008